A General Theory of Pathwise Coordinate Optimization for Nonconvex Sparse Learning∗
Authors
Abstract
Pathwise coordinate optimization is one of the most important computational frameworks for solving high-dimensional convex and nonconvex sparse learning problems. It differs from classical coordinate optimization algorithms in three salient features: warm start initialization, active set updating, and a strong rule for coordinate preselection. These three features grant superior empirical performance, but also pose significant challenges to theoretical analysis. To tackle this long-standing problem, we develop a new theory showing that these three features play pivotal roles in guaranteeing the outstanding statistical and computational performance of the pathwise coordinate optimization framework. In particular, we analyze existing pathwise coordinate optimization methods and provide new theoretical insights into them. The resulting theory motivates several modifications that improve the framework and guarantee linear convergence to a unique sparse local optimum with optimal statistical properties (e.g., minimax optimality and oracle properties). This is the first result establishing computational and statistical guarantees for the pathwise coordinate optimization framework in high dimensions. Thorough numerical experiments are provided to support our theory.
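To make the three features concrete, the following is a minimal NumPy sketch of pathwise coordinate descent for the plain lasso. It is an illustration under simplifying assumptions (standardized columns of X with squared norm n, a soft-thresholding update, no nonconvex penalty), not the authors' implementation; the function and parameter names (`pathwise_lasso`, `n_lambdas`, `eps`) are invented for this example.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding: argmin_b 0.5*(b - z)**2 + t*|b|."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def pathwise_lasso(X, y, n_lambdas=50, eps=1e-3, tol=1e-6, max_iter=500):
    """Sketch of pathwise coordinate descent: warm starts along a
    decreasing lambda path, strong-rule screening, and an inner loop
    restricted to an active set. Assumes standardized columns of X."""
    n, p = X.shape
    lam_max = np.max(np.abs(X.T @ y)) / n      # smallest lambda giving an all-zero solution
    lambdas = lam_max * np.logspace(0, np.log10(eps), n_lambdas)
    beta = np.zeros(p)                          # warm start: carried across lambdas
    path, lam_prev = [], lam_max
    for lam in lambdas:
        r = y - X @ beta
        grad = X.T @ r / n
        # Strong rule: preselect j with |x_j' r / n| >= 2*lam - lam_prev.
        active = np.flatnonzero(np.abs(grad) >= 2.0 * lam - lam_prev)
        active = np.union1d(active, np.flatnonzero(beta))   # never drop the current support
        for _ in range(max_iter):
            max_delta = 0.0
            for j in active:
                old = beta[j]
                z = old + X[:, j] @ r / n       # univariate problem for coordinate j
                beta[j] = soft_threshold(z, lam)
                if beta[j] != old:
                    r -= X[:, j] * (beta[j] - old)           # keep the residual in sync
                    max_delta = max(max_delta, abs(beta[j] - old))
            if max_delta < tol:
                break
        # A full implementation would also verify the KKT conditions on the
        # screened-out coordinates and re-solve if any are violated.
        path.append((lam, beta.copy()))
        lam_prev = lam
    return path
```

In this sketch, warm starts enter through `beta` carrying over between consecutive lambda values, active set updating through the restricted inner sweep, and the strong rule through the screening threshold `2*lam - lam_prev`.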
Similar resources
Pathwise Coordinate Optimization for Sparse Learning: Algorithm and Theory
Pathwise coordinate optimization is one of the most important computational frameworks for high-dimensional convex and nonconvex sparse learning problems. It differs from classical coordinate optimization algorithms in three salient features: warm start initialization, active set updating, and a strong rule for coordinate preselection. Such a complex algorithmic structure grants superior ...
The picasso Package for Nonconvex Regularized M-estimation in High Dimensions in R
We describe an R package named picasso, which implements a unified framework of pathwise coordinate optimization for a variety of sparse learning problems (Sparse Linear Regression, Sparse Logistic Regression and Sparse Column Inverse Operator), combined with distinct active set identification schemes (truncated cyclic, greedy, randomized and proximal gradient selection). Besides, the package p...
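The active-set identification schemes the package names can be pictured as different rules for choosing which coordinates to update next. The toy function below is a hypothetical Python illustration of that idea only (picasso itself is an R package, and this is not its API; the function name, arguments, and scheme labels are assumptions for this sketch).

```python
import numpy as np

def select_coordinates(grad, beta, scheme="greedy", rng=None):
    """Toy coordinate-preselection rules loosely mirroring cyclic,
    greedy, and randomized selection; 'grad' is the current gradient
    and 'beta' the current iterate."""
    p = grad.shape[0]
    if scheme == "cyclic":
        return np.arange(p)                         # sweep all coordinates in order
    if scheme == "greedy":
        return np.array([np.argmax(np.abs(grad))])  # only the steepest coordinate
    if scheme == "randomized":
        rng = rng or np.random.default_rng(0)
        return rng.permutation(p)                   # sweep in random order
    raise ValueError(f"unknown scheme: {scheme}")
```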
SparseNet: Coordinate Descent With Nonconvex Penalties.
We address the problem of sparse selection in linear models. A number of nonconvex penalties have been proposed in the literature for this purpose, along with a variety of convex-relaxation algorithms for finding good solutions. In this article we pursue a coordinate-descent approach for optimization, and study its convergence properties. We characterize the properties of penalties suitable for...
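For intuition, the coordinate-wise update under one such nonconvex penalty, the MCP, has a closed form. Below is a hedged sketch of the univariate MCP ("firm thresholding") operator for a standardized design with gamma > 1; the function name is illustrative, not SparseNet's API.

```python
import numpy as np

def mcp_threshold(z, lam, gamma=3.0):
    """Univariate minimizer of 0.5*(b - z)**2 + MCP(b; lam, gamma),
    gamma > 1, as used coordinate-wise with standardized features."""
    if abs(z) <= gamma * lam:
        # Firm thresholding: soft-threshold, then inflate to reduce bias.
        return np.sign(z) * max(abs(z) - lam, 0.0) / (1.0 - 1.0 / gamma)
    return z   # beyond gamma*lam the MCP penalty is flat: no shrinkage
```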
Accelerated Block Coordinate Proximal Gradients with Applications in High Dimensional Statistics
Nonconvex optimization problems arise in many research fields and attract considerable attention in signal processing, statistics, and machine learning. In this work, we explore the accelerated proximal gradient method and some of its variants, which have recently been shown to converge in nonconvex settings. We show that a novel variant proposed here, which exploits adaptive momentum and block ...
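As a reference point for the momentum idea, here is a minimal single-block FISTA-style sketch for the lasso. It omits the adaptive momentum and block structure that paper studies, and all names are illustrative.

```python
import numpy as np

def accel_prox_grad(X, y, lam, n_iter=200):
    """Accelerated proximal gradient (FISTA-style) for the lasso
    objective 0.5/n * ||y - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    L = np.linalg.norm(X, 2) ** 2 / n            # Lipschitz constant of the smooth gradient
    beta = z = np.zeros(p)
    t = 1.0
    for _ in range(n_iter):
        grad = X.T @ (X @ z - y) / n
        u = z - grad / L                          # gradient step from the extrapolated point
        beta_new = np.sign(u) * np.maximum(np.abs(u) - lam / L, 0.0)   # prox = soft-threshold
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        z = beta_new + (t - 1.0) / t_new * (beta_new - beta)           # momentum extrapolation
        beta, t = beta_new, t_new
    return beta
```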
Journal:
Volume / Issue:
Pages: -
Publication date: 2016